Resolving syntactic ambiguities with lexico-semantic patterns: an analogy-based approach


  • Simonetta Montemagni
  • Stefano Federici
  • Vito Pirrelli

A system for the resolution of syntactic ambiguities is illustrated which operates on morpho-syntactically ambiguous subject-object assignments in Italian and tries to find the most likely analysis on the basis of the evidence contained in a knowledge base of linguistic data automatically extracted from on-line resources. The system works on the basis of a set of straightforward analogy-based principles. Its performance on a substantial corpus of test data extracted from real texts is described.

1 Introduction

In this paper, a system for the resolution of syntactic ambiguities is illustrated: SenSOR, the SEmaNtic Subject-Object disambiguatoR, operates on morpho-syntactically ambiguous subject-object assignments in Italian and tries to find the most likely analysis by using the evidence contained in a knowledge base of linguistic data automatically extracted from machine readable dictionaries (MRDs), both taxonomic information and example sentences. The system works on the basis of a set of straightforward analogy-based principles which have been used for a wide range of NLP applications (Pirrelli et al., 1992; Montemagni et al., 1994; Federici et al., 1996). Both inherent semantic properties of words (as embodied by taxonomical relationships) and word distributional properties (as attested in example sentences) are exploited as a clue to the most likely Subject-Object Assignment (SOA).

We start with an illustration of the parsing problem, to move on to a consideration of the nature of the lexico-semantic knowledge usable for its solution. SenSOR's Knowledge Base (KB) is then described, together with the function which projects ambiguous SOAs onto KB in the search for the best candidate analogue.
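As a rough illustration of how taxonomic and distributional evidence could jointly favour one subject-object assignment over another, consider the toy sketch below. It is not SenSOR's knowledge base or projection function: the taxonomy, the attested patterns, the scoring weights and the names `HYPERNYM`, `PATTERNS`, `support` and `best_reading` are all assumptions made up for this example.

```python
# Toy illustration only: data, names and scoring are assumptions,
# not SenSOR's actual knowledge base or algorithm.

# Hypothetical taxonomy: noun -> hypernym.
HYPERNYM = {
    "bambino": "persona",
    "studente": "persona",
    "libro": "pubblicazione",
    "giornale": "pubblicazione",
}

# Hypothetical verb-subject/object patterns, of the kind that might be
# attested in dictionary example sentences.
PATTERNS = {
    ("leggere", "subject"): {"studente"},
    ("leggere", "object"): {"giornale"},
}


def support(verb, role, noun):
    """Return 2 for an attested filler, 1 for a taxonomic sibling, else 0."""
    attested = PATTERNS.get((verb, role), set())
    if noun in attested:
        return 2
    if noun in HYPERNYM and any(
        HYPERNYM.get(other) == HYPERNYM[noun] for other in attested
    ):
        return 1
    return 0


def best_reading(verb, noun_a, noun_b):
    """Pick the subject/object assignment with the highest total support."""
    readings = [
        {"subject": noun_a, "object": noun_b},
        {"subject": noun_b, "object": noun_a},
    ]

    def score(reading):
        return support(verb, "subject", reading["subject"]) + support(
            verb, "object", reading["object"]
        )

    # On a tie, max() keeps the first reading (surface order).
    return max(readings, key=score)


print(best_reading("leggere", "bambino", "libro"))
```

In this sketch neither noun is directly attested with the verb, so the preferred reading is chosen purely by analogy with taxonomically related fillers, regardless of the order in which the two nouns are supplied.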
Two tests of SenSOR's performance are illustrated and discussed in some detail. Finally, further improvements are sketched and other possible applications of the system envisaged.

2 The problem

A crucial problem in parsing Italian is the assignment of subject and object relations to sentence constituents. It is often the case that grammatical relations cannot be assigned unambiguously on the basis of morpho-syntactic information only: in the sentence il bambino legge il libro 'the child reads the book' agreement information is not decisive for SOA since both nominal constituents agree with the verb. On the other hand, word order information cannot be relied on conclusively due to the freedom allowed in the ordering of sentence constituents in Italian, where virtually all permutations of verb, subject and object are possible. The ambiguities stemming from this freedom are ubiquitous and represent a problem for any NLP system dealing with Italian, a problem to whose resolution a wide variety of factors, both linguistic (i.e. phonological, morphological, syntactic, lexico-semantic and pragmatic) and extralinguistic (i.e. based on world knowledge), contributes. Here, we concentrate on how morpho-syntactically ambiguous SOAs can be solved on the basis of lexico-semantic knowledge; in particular, the focus is on the lexico-semantic restrictions that a verb or a noun imposes on its context.

3 On the nature of lexico-semantic knowledge

The lexico-semantic knowledge used for our purposes consists of typical Verb-Subject/Object (VSO) cooccurrence patterns, automatically acquired from MRDs, whose single elements are ex-

